DOCUMENT RESUME

ED 129 878                                        TM 005 680

AUTHOR        O'Reilly, Robert P.; Streeter, Ronald E.
TITLE         Reports on the Development and Validation of a System
              for Measuring Literal Comprehension in a
              Multiple-Choice Cloze Format: Preliminary Factor
              Analytic Results.
PUB DATE      Mar 76
NOTE          33p.; Paper presented at the Annual Meeting of the
              American Educational Research Association (60th, San
              Francisco, California, April 19-23, 1976)

EDRS PRICE    MF-$0.83 HC-$2.06 Plus Postage.
DESCRIPTORS   *Cloze Procedure; Elementary Education; *Factor
              Analysis; *Multiple Choice Tests; *Reading
              Comprehension; Reading Tests; *Test Validity



ABSTRACT

The results of a series of factor analyses of a new
test of literal comprehension using a multiple-choice cloze format
are summarized. These analyses were conducted in the validation of a
test designed to measure for the most part a factor of literal
comprehension independent of IQ and inferential reading processes,
yet marked by certain related types of test items included in
standardized and other measures of literal comprehension. In this
study, the Multiple-Choice Cloze (MCC) test was administered to a
sample of 3,125 students in grades one to six in a medium-sized urban
school district in conjunction with its annual standardized testing
program. Besides the MCC, other measures included in the analyses
were an alternate measure of literal comprehension based on Bormuth's
wh-item, a measure of passage independence based on wh-items, the
Short Form Test of Academic Aptitude, and the California Achievement
Test. The factor analytic results support the conclusion that the MCC
measures literal comprehension, a trait that is essentially
independent of IQ. However, it was also determined that the MCC had
minor loadings on a second, and possibly a third, component related
to IQ, inferential reading skills, and language mechanics.



This report presents the results of a series of exploratory factor
analyses of a new test of reading comprehension using a multiple-choice
cloze format. These analyses are part of a preliminary examination of data
gathered on the test in an administration to more than 5,000 students in
grades 1-9 in May 1975.

This test development project is concerned with the design and
validation of a test of reading comprehension with certain properties that
would tend to improve the utility of comprehension testing in the schools.
First, instead of providing a fixed test, the intent was to construct a pool
of scaled passages and items that could be used to assemble n tests of
reading comprehension for a given evaluation purpose with any student or
group in grades 1-12. Secondly, the test was to be a measure of literal
comprehension or language comprehension per se, as opposed to extant
measures of reading comprehension which seem to be psychologically
synonymous with higher order reasoning processes (Singer, 1973; Thorndike,
1973-74). Thirdly, the test was to be domain-referenced in the sense that
any test assembled from the item and passage pool would represent an
unbiased sampling of one or more universes of written discourse. Fourthly,
the test was to be based on objective-generative item construction
procedures (see Hively, 1974) so that the test construction technology
could be economically and reliably reproduced by others.

The standard cloze was initially selected as the item format that
offered the most possibilities for building the required test of reading
comprehension. Cloze tests are highly passage dependent (the student has
virtually no chance of responding correctly unless he reads the passage).
The cloze item format offers an objective procedure for the construction of
comprehension items, one that can be systematically and widely applied to
samples of written discourse. The item format is also generally coherent
with the ongoing act of reading comprehension if viewed as a constructive
language process (Ryan and Semmel, 1969; Smith, 1975). Since there are no
questions in a cloze item, the test passage remains unaffected by the
idiosyncrasies of the item writer. Finally, and of utmost importance for
the construction of a specific measure of comprehension, various deletion
strategies allow for the manipulation of the interaction between reader and
test passage such that the contributions of syntactic, semantic, and
reasoning factors may be controlled.

Although the foregoing features of the standard cloze represent
important advantages, they are considerably offset by problems with validity
and application. On the side of application problems, the standard cloze
format is not readily perceived as a test of reading comprehension, and the
first large-scale attempt to apply the technique as a survey test in the
schools resulted in serious difficulties with interpretation and use of the
data (Hansen and Hesse, 1974). Apparently, the standard cloze is also an
extremely difficult and anxiety-invoking test (Cranney, 1972; Rankin, 1974).
The required length of a cloze passage makes it inconveniently long as a
unit of test assembly. And, the test has the notable disadvantage of
requiring hand scoring.

On the issue of validity, Bormuth (1969) states, "Much of the research
has shown that scores on cloze tests are highly correlated with scores on
standardized tests of reading comprehension ability," but actually reviews
of the literature emphasize low to moderate correlations (Potter, 1968;
O'Reilly et al., 1976). There is also a strong indication in the literature
that the correlation between cloze scores and comprehension scores on
standardized tests is substantially attributable to the concentration of
IQ in both tests (Rankin, 1974).

Some studies of the validity of the cloze as a measure of reading
comprehension indicate that the standard or any-word deletion pattern
unduly weights the syntactic component in a test passage at the expense
of the semantic component (Taylor, 1953; Louthan, 1965; Bickley, Weaver,
and Ford, 1968; and Rankin, 1974). It also appears that responses to the
deletions in a standard cloze test passage are chiefly dependent upon a
surrounding context of 5-10 words (Taylor, 1956; MacGinitie, 1961),
suggesting insensitivity of the test to the larger ideas that may run
through the passage (Carroll, 1972). Finally, it may be fairly said that
cloze research has generally not been well designed to explore the issue
of the validity of the construct underlying the test, as Ohnmacht, Weaver,
and Kohler (1970) have remarked:

     The fact that responses to cloze tasks reflecting essentially
     gross deletion strategies align themselves with crude
     measures of comprehension does little to shed light upon the
     fundamental nature of comprehension other than to indicate
     that one can measure what passes for comprehension in more
     than one way. . . . Researchers using the cloze procedure
     ought to give careful consideration to language operations
     and to rational operations which are implicit in verbal
     activity and they should construct deletion patterns which
     seem to relate to these operations. Rather than standardizing
     a particular cloze deletion type, exploration of a wider
     range of deletion types which are related to particular
     linguistic and psychological hypotheses is needed. (pp. 215-216)

The present work attempted to meet the exhortations of Ohnmacht et al.
in a rational redesign of the standard cloze item format as a measure of
reading comprehension. This study is apparently the first attempt to
explicitly design a comprehension item to tap the "pure" comprehension of
language factor distinguished by Carroll (1972) from the high level
inferential processes that weigh heavily in standard mixed tests of reading
comprehension. This comprehension or language factor, referred to here as
literal comprehension, is briefly defined as the "apprehension of the
grammatical and semantic relations which obtain within and among the
sentences of the discourse" (Katz and Fodor, 1963, p. 172). The elaboration
of the construct given elsewhere (Schuder, Kidder, and O'Reilly, 1976)
assumes that literal comprehension is essentially independent of IQ and
is marked by certain types of tests or items included in standardized and
other measures of reading comprehension, including some vocabulary measures--
particularly those which focus on interpretation of word meanings in context,
factual questions about passages, questions about explicit details,
questions about implications or entailment relations which hold within a
passage, paraphrase questions, and certain types of main idea and title
questions.

The Modified Cloze Format: Characteristics and Rationale
The cloze item format designed in this research is technically referred
to as an item form (Hively, 1974) that is generally suitable for processing
brief, coherent passages into multiple-choice cloze (MCC) items. The MCC
passage format, as shown in Figure 1, resembles a standard cloze passage
attenuated in length. The passages are generally about 70-80 words long
(in grades 1 and 2 they range from 25-45 words), and they exhibit the standard
deletion rate of approximately every fifth word. However, inspection of
Figure 1 shows that the MCC item form is a lexical cloze: only nouns,
verbs, adjectives, and adverbs have been deleted, in contrast with the
standard or any-word approach, which results in deletion of both structural
and lexical items.
Place Figure 1 about here



Below the test passage appears a set of 5 response choices for each
deletion, one of which is the exact word deleted from the passage. All
distractors, it may be noted, would be grammatically plausible in the
position of the deleted word for which they function as distractors.
Distractors for each deletion are generated by a computer program that
randomly accesses sets of words from a 12,400 word vocabulary list within
the constraints of: (a) the grammatical class of the deleted word; (b)
whether the deleted word is a "content" word or is a core (basic) vocabulary
word; and (c) the grade level of the deleted word. Any given distractor
thus functions grammatically but not semantically in the position of the
deleted word, is at the same graded reading level as the passage source,
and is a content-specific word or a core word as required to match the
subject matter area to which the word belongs. The core or general vocabulary
lists were compiled from Harris and Jacobson's Basic Elementary Reading
Vocabulary (1972) and the EDL Research and Information Bulletin 5: A
Revised Core Vocabulary (Taylor, Frackenpohl, and White, 1969). The
content-specific word lists were compiled from both the Harris and Jacobson
source and the American Heritage Word Frequency Book (Carroll, Davies, and
Richman, 1971).
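
The constrained random draw described above can be illustrated with a minimal Python sketch. It is not the program used in the study; the `VocabEntry` record, the `draw_distractors` function, the `kind` labels, and the toy word list are all hypothetical stand-ins for the 12,400 word vocabulary list and its coding.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabEntry:
    word: str
    word_class: str  # "noun", "verb", "adjective", or "adverb"
    grade: int       # graded reading level of the word
    kind: str        # "core", or a content-area label such as "science"


def draw_distractors(deleted, vocabulary, n=4, rng=None):
    """Randomly draw n distractors that match the deleted word on
    grammatical class, core/content status, and grade level."""
    rng = rng or random.Random()
    eligible = [
        entry.word
        for entry in vocabulary
        if entry.word != deleted.word
        and entry.word_class == deleted.word_class
        and entry.kind == deleted.kind
        and entry.grade == deleted.grade
    ]
    if len(eligible) < n:
        raise ValueError("too few eligible words for this deletion")
    return rng.sample(eligible, n)


# Toy stand-in for the vocabulary list (all entries are illustrative).
VOCABULARY = [
    VocabEntry("river", "noun", 3, "core"),
    VocabEntry("garden", "noun", 3, "core"),
    VocabEntry("window", "noun", 3, "core"),
    VocabEntry("basket", "noun", 3, "core"),
    VocabEntry("ladder", "noun", 3, "core"),
    VocabEntry("pencil", "noun", 3, "core"),
    VocabEntry("carry", "verb", 3, "core"),      # wrong grammatical class
    VocabEntry("magnet", "noun", 3, "science"),  # content word, not core
    VocabEntry("harbor", "noun", 5, "core"),     # wrong grade level
]

deleted = VOCABULARY[0]
# Four distractors plus the exact deleted word give the 5 response choices.
choices = draw_distractors(deleted, VOCABULARY, rng=random.Random(0)) + [deleted.word]
```

Because the draw is random within the three constraints, nothing in this sketch checks that a distractor is semantically implausible in context; the paper does not say how, or whether, that was verified.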

The MCC item format preserves many of the advantages of the cloze
technique (e.g., absence of questions and objective item construction) while
potentially enhancing its applicability as a measure of reading comprehension.
Face validity appears to have been considerably enhanced, and the 10-item
passage unit is a convenient module for the assembly of a test with 5-10
passages of increasing difficulty. The excessive difficulty and ambiguity
of the original cloze testing situation appears to have been considerably
reduced. In fact, the MCC test should generally suffer less from such
sources of invalidity as test anxiety because the test passages do not
function as ordinary test items until the student reaches the point of
no-comprehension with a passage.

The validity of the cloze test has been theoretically improved by
selectively preserving some of the original features of the technique and
substantially modifying others. The every fifth-word deletion pattern has
been maintained because this permits the most thorough and objective
sampling of the ideas and linguistic structures of the test passage without
depriving the student of the information necessary to replace the deleted
words (MacGinitie, 1961; Ramanauskas, 1972). The lexical deletion pattern
should tend to improve validity in several respects. Nouns, verbs,
adjectives, and adverbs carry most of the information in a passage, thus
focusing the test on the semantic component and on larger semantic units
(Fillenbaum, 1963), without excluding the syntactic component. According to
prior research (Taylor, 1956; Rankin, 1974), the lexical deletion pattern
should also tend to reduce the correlations of the test with IQ while
enhancing correlations with test scores reflecting comprehension of the more
"explicit" meanings of a passage.
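
As a rough illustration of a lexical deletion pattern at approximately every fifth word, consider the Python sketch below. It is an assumption about how such a rule might be operationalized, not the procedure used in the study: the `STRUCTURAL` word set is a tiny hypothetical stand-in for real grammatical-class information, and the rule of skipping ahead to the next lexical word is likewise assumed.

```python
# Hypothetical, deliberately small set of structural (function) words.
# A real system would rely on grammatical-class coding of each word.
STRUCTURAL = {
    "the", "a", "an", "and", "but", "of", "to", "in", "on", "at",
    "by", "for", "with", "is", "was", "it", "he", "she", "his", "her",
}


def lexical_cloze(words, rate=5):
    """Blank roughly every `rate`-th word; when that word is structural,
    skip ahead to the next lexical word instead."""
    passage = list(words)
    answers = []
    i = rate - 1
    while i < len(words):
        j = i
        while j < len(words) and words[j].lower() in STRUCTURAL:
            j += 1
        if j == len(words):
            break  # only structural words remain
        answers.append(words[j])
        passage[j] = "_____"
        i = j + rate
    return passage, answers


text = "The small dog ran quickly to the old barn and found a warm bed of straw"
passage, answers = lexical_cloze(text.split())
# answers is ["quickly", "found", "straw"]
```

In this toy sentence the tenth word ("and") is structural, so the deletion moves to the next lexical word ("found"), which is why the realized rate is only approximately one word in five.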

The procedure for generating distractors that compete grammatically
but not semantically is specifically designed to limit, insofar as possible,
the context for interpretation of the test passage. Prior experimentation
with the MCC format indicated that the use of semantically interfering
distractors would have the effect of introducing a very difficult
vocabulary element into the test, with the further effect of increasing the
correlation of the test with IQ (Cranney, 1972). The distractor design
is also intended to enhance test validity by maintaining the passage
dependency of the test. The use of grammatically equivalent responses
in the MCC item format should function to eliminate the use of grammatical
cues as the sole basis for choosing among distractors. Similarly, the
inclusion of competing content words among certain groups of distractors
should tend to eliminate discontinuities in content as a basis for choosing
among distractors.

Method 

The issue of the validity of the literal comprehension construct and
the MCC item format as a measure of the construct was studied in the context
of the annual standardized testing program of a medium-sized urban school
district. This school district contributed two 40-minute testing periods
during which the MCC item format, along with an alternate measure of
literal comprehension based on Bormuth's (1970) "wh-item," and a brief
measure of passage independence based on the wh-items were administered.
These measures, together with measures of verbal and non-verbal IQ and
measures of language and reading performance available from the school
district standardized testing program, provided a matrix of test scores
suitable for exploring the construct of literal comprehension via factor
analysis.

The breadth of the test administration, which ranged across several
grade levels, and the variety of the tests available in the study permitted
consideration of several meaningful questions relating to the importance
and properties of the construct of literal comprehension. Chief among these
was the question of whether factor analysis would verify a substantial
literal comprehension factor that was generalizable across a large number
of MCC test passages and that was similarly constituted across several grade
levels of the study population. In addition, it was expected that this
literal comprehension factor would be essentially independent of the IQ
and passage independence measures available in the study and would be
marked by substantial loadings on other reading tests that appear to measure
a similar factor, or are related to the factor.
Sample 

The original study sample consisted of 5,722 students in grades 1-9,
with roughly 500-750 students in each grade level. Students were grouped
into subsamples for the analysis based on the test levels in the California
Achievement Test (CAT) battery. The present study is based on the response
data available for the first three subsamples in grades 1-6 where IQ scores
were available. Subsample I consisted of 456 first-graders, subsample II
had 972 second and third graders, and subsample III had 1,697 students in
grades 4-6.
Instruments 

The test scores available for the analyses for each subsample are
listed in clusters in Table 1 under each CAT level, along with descriptive
data for each score. Test scores that are expected to strongly mark the
literal comprehension factor are underlined. A brief description of each
test score cluster follows.



Place Table 1 About Here 
Multiple-Choice Cloze Test. The MCC test forms consisted of two sets
of 12 parallel test forms, one set for grades 1-3 and one set for grades
4-6. The test forms were systematically assembled from a pool of 353 MCC
passages drawn from basal readers and literature texts for grades 1-10.
Each set of forms was assigned a range of readability levels in the cloze
passage pool (passages in the pool are ordered on readability), and within
these ranges of readability consecutive pairs of readability levels were
regarded as sampling units (except at grades 1-3, where the first two
readability levels were treated as separate sampling units). Test forms
for a given grade level range were then constructed by sampling without
replacement 6 ordered passages, one from each consecutive sampling unit.
The order of passage readability was maintained in the test form. Each
test form contained 39, 41, or 60 items presented to the testee in groups
of 3, 5, or 10 items. The shorter test forms were at grades 1-3, where the
first 3 passages in a form were 25-45 words long with 3 or 5 items per
passage.
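
The form-assembly step can be sketched in Python as follows. This is only one reading of the description: it assumes that "without replacement" means no passage is reused across the forms in a set, it ignores the grades 1-3 exception for the first two readability levels, and the function name `assemble_forms` and the passage identifiers are invented for illustration.

```python
import random


def assemble_forms(sampling_units, n_forms, rng=None):
    """Build n_forms test forms, each with one passage per sampling unit.

    sampling_units is a list of lists of passage ids, ordered from the
    easiest to the hardest readability range. Passages are drawn without
    replacement, and each form keeps the readability order of the units.
    """
    rng = rng or random.Random()
    remaining = [list(unit) for unit in sampling_units]  # leave input intact
    forms = []
    for _ in range(n_forms):
        form = []
        for unit in remaining:
            if not unit:
                raise ValueError("a sampling unit ran out of passages")
            form.append(unit.pop(rng.randrange(len(unit))))
        forms.append(form)
    return forms


# Toy pool: 6 sampling units of 15 passages each (ids are illustrative).
units = [[f"U{u}-P{p}" for p in range(15)] for u in range(6)]
forms = assemble_forms(units, n_forms=12, rng=random.Random(1))
```

Each resulting form has six passages in ascending readability order, and the twelve forms in a set share no passages under the stated assumption.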

The MCC test yields 4 subscores corresponding to the grammatical classes
of the words deleted in a cloze passage. Due to the distribution of
grammatical classes in the passages, the noun score has the largest mean
and variance, followed by the verb score and the scores for adjectives and
adverbs. Internal reliabilities (KR-20) for the MCC test forms in grades
1-3 and 4-6 ranged from .94-.97, with a median reliability of .96.
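
For readers unfamiliar with the KR-20 coefficient reported here, a minimal Python sketch follows. It applies the standard Kuder-Richardson formula 20, KR-20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / variance of total scores), to right/wrong item data. The use of the population variance and the tiny response matrix are assumptions made for illustration; the paper does not describe its computation.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 internal consistency for dichotomous (0/1) item scores.

    responses is a list of examinee rows, each a list of 0/1 item scores.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)  # population variance (an assumption)
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_examinees  # proportion correct
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Illustrative data: 5 examinees by 4 items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(toy)  # 0.8 for this toy matrix
```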

The Wh-Item Test. Because the standardized measure of reading used in
the study was, in many respects, an ambiguous criterion for the MCC as a
measure of literal comprehension, an alternative measure of the construct
was developed. Forms of this test, called the Wh-Item Test, were assembled
from a pool of some 300 ordered passages and 3,000 main idea and wh-items
using a design for the selection of test passages that was virtually
identical to that used for the assembly of the cloze test forms. Five
items were selected from the 8 wh-items available for each of the 6
ordered passages in a test form so that there were equal distributions of
wh-item types across test forms. The wh-item types are: how, what (noun),
what (verb), when, where, which, who, why. This procedure resulted in two
sets of uniform, 30-item tests in each test level that paralleled the
Multiple-Choice Cloze tests in number of passages and range of passage
difficulty.

The Wh-Item Test yields 8 subscores corresponding to the wh-item types
represented in the test. Internal reliabilities (KR-20) for the Wh-Item
Test in grades 1-3 ranged from .90-.94, with a median reliability of .91;
and in grades 4-6 from .85-.94, with a median reliability of .93.

Test-Wiseness Test. Because the MCC and Wh-Item Test forms were
considered to be passage-dependent measures of reading comprehension, a
special test was constructed to test this assumption. The design of this
test, referred to as the Test-Wiseness Test, paralleled the MCC and Wh-Item
Test form designs. The questions, not the passages, in each Wh-Item Test
form were pooled separately for grades 1-3 and 4-6. A set of 12 test forms
was then constructed for each grade level range by systematically drawing
12 items from this pool for each test form. Care was also taken to represent
the related passage difficulties for the items and the 8 types of wh-items
in a test form in an attempt to create parallel tests. The relationship
between scores on this Test-Wiseness measure and scores on the Wh-Item
Test provides some indication of the extent to which students' responses
on the latter test are dependent on reading the associated test passages.
This test also provides some indication of the extent to which this form
of test-wiseness affects responses on the Multiple-Choice Cloze test.

The Test-Wiseness Test yields a single score. Internal reliabilities
for the Test-Wiseness Test in grades 1-3 ranged from .13-.79, with a median
reliability of .68; and in grades 4-6 from .29-.76, with a median reliability
of .70.

Short Form Test of Academic Aptitude. The Short Form Test of Academic
Aptitude (SFTAA) is a group-administered intelligence test that yields
language and non-language IQs. This test, administered by the school
district to students in grades 1-6, along with the California Achievement
Test, permitted study of the relationship between IQ and the literal
comprehension test across the study subsamples.

California Achievement Test. The various CAT reading and language
test scores used in the study were previously listed in Table 1 by CAT
test level. These subscores, rather than the lengthier and more reliable
CAT skill scores (major headings in Table 1), were used in order to provide
a less ambiguous basis for marking the expected literal comprehension
factor, as opposed to an inferential factor which might be expected to appear,
marked by IQ and such CAT subtests as generalizations and inferences.
However, preliminary correlational analysis and inspection of the CAT
comprehension items indicated that this approach did not satisfactorily resolve
the CAT comprehension section into independent literal and non-literal
subtests. The CAT comprehension section at every level appeared to be
overall considerably more "literal" than was expected, in light of the
subtest labels. The supposedly inferential subtests were substantially
contaminated with literal items and vice versa. Both "literal" and
"non-literal" subtests also contained numbers of items that appeared to be
passage independent (could be answered without reading the passage).
Consequently, in the hopes of further disambiguating the criteria for
marking literal comprehension in the factor analysis, the CAT
comprehension items were re-classified into three new subtest scores: (a) items
that appeared to measure literal comprehension and were passage dependent;
(b) items that appeared to measure literal comprehension but were
passage independent; and (c) non-literal items or items that seemed to
reflect higher order, inferential processes. The CAT subtests based on the
literal-non-literal and passage-dependent item classification are identified
as the CAT Item Classification cluster in Table 1.
Analysis 

The data set available on the foregoing test scores was organized
separately for analysis by each CAT level identified in Table 1. To
permit analysis across the test forms constructed for the study at each
level, the raw scores for the Wh-Item, MCC, and Test-Wiseness Test forms
were converted to z scores based on the score distribution for each test
form in a test level. Subsequently, negative values were eliminated by
applying a linear transformation to each set of obtained z scores. The
resultant scores from any of the foregoing tests in a test level were
thereafter treated as having come from equivalent test forms and were
combined as required for the analyses by CAT level.



This approach to test equating, though somewhat unorthodox, is defensible
on several grounds. The general shapes, means, and the standard deviations
of the distributions of the Wh-Item and MCC test scores were very similar
from form to form in a test level (usually the average raw score difference
from form to form was less than one-fourth of a standard deviation), the
internal reliabilities of each form were consistently high, and the tests
had been systematically assembled to be parallel in order and range of
readability level. This approach, however, was less defensible for the
Test-Wiseness Test, which varied from form to form in reliability and in the
distribution of scores.
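
The within-form standardization used for equating can be illustrated with a short Python sketch. The paper does not report the constants of the linear transformation, so the values 50 and 10 below (a T-score style rescaling) are purely an assumption, as are the function name `rescale_form` and the toy raw scores.

```python
from statistics import mean, pstdev


def rescale_form(raw_scores, center=50.0, spread=10.0):
    """Convert one form's raw scores to z scores, then apply a linear
    transformation (center + spread * z) that removes negative values.
    The constants are assumed, not taken from the study."""
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    return [center + spread * (x - m) / s for x in raw_scores]


# Two toy forms with different raw-score scales.
form_a = [10, 20, 30]
form_b = [40, 45, 50]

# After rescaling within each form, scores are pooled across forms.
pooled = rescale_form(form_a) + rescale_form(form_b)
```

Standardizing within each form puts every form on a common mean and standard deviation, which is why the approach depends on the forms having similarly shaped score distributions, as the text argues.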
The factor analyses were organized in three stages: (1) the first
stage factor analyzed only the MCC and Wh-Item subscores; (2) the second
stage then added the conventional CAT subscores and the IQ scores identified
in Table 1 to the analysis; and (3) the third stage replaced the CAT
comprehension scores with the CAT Item Classification scores and added the
Test-Wiseness score. In each stage of the analysis, the various test scores
were intercorrelated by subsample and the resulting matrices subjected to
principal components analysis with ones in the diagonals. Components with
eigenvalues > 1.00 were then rotated to the varimax criterion. These
analyses were then rerun with only the noun and verb scores used to represent
the MCC test in the hope of lending further clarification to the results.
To evaluate the expectation that the factors in the analyses would be
correlated, the factor analyses were run again and rotated to the oblique
criterion. The resulting correlations between the obtained factors are
of theoretical interest here, but the factor matrices are not reported
because the oblique results were nearly identical to the orthogonal findings.
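
The orthogonal part of this procedure (principal components of a correlation matrix with ones in the diagonal, retention of components with eigenvalues above 1.00, and varimax rotation) can be sketched with NumPy as below. This is a generic textbook implementation using raw varimax without Kaiser normalization, which is an assumption; it is not the software used in the study, and the 6-variable correlation matrix is invented for illustration.

```python
import numpy as np


def principal_components(R):
    """Loadings for components of correlation matrix R (unities in the
    diagonal) whose eigenvalues exceed 1.00."""
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    keep = eigenvalues > 1.0
    return eigenvectors[:, keep] * np.sqrt(eigenvalues[keep])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal rotation of a loading matrix to the raw varimax criterion."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() - previous < tol:
            break
        previous = s.sum()
    return loadings @ rotation


# Invented correlation matrix: two clusters of three variables each.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.7
R[3:, 3:] = 0.5
np.fill_diagonal(R, 1.0)

unrotated = principal_components(R)  # two components have eigenvalues > 1
rotated = varimax(unrotated)
```

In this toy case the rotation leaves each variable's communality unchanged while concentrating its loading on a single factor, which is the "simple structure" that makes rotated factors easier to name. The oblique rotation mentioned in the text is not sketched here because the paper does not identify the specific oblique criterion used.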

Results 

Stage 1 

The results of the factor analyses at stage 1 are given in order by
subsample in Tables 2-4. The rotated factor matrices in the first analysis
indicate a consistent tendency for the MCC test to split into two factors
across subsamples: (a) Factor I is the more important factor and is defined
primarily by the noun and verb scores and the Wh-Item subscores; and (b)
Factor II is defined most strongly by the adjective and adverb subscores in
grade 1 and very strongly by all MCC subscores in the other subsamples.
Deleting the adjective and adverb subscores, as shown in the second set of
factor matrices, reduces the matrix to one factor and sharply increases the
contribution of the cloze noun and verb subscores to Factor I. This analysis
was originally designed to reveal that Factor II was attributable to the
adjective and adverb subscores, but actually the pattern of results indicates
an increasing contribution of all four MCC subscores to this second cloze
component across subsamples.



Place Tables 2-4 About Here 



Stage 2 

The results for the stage 2 factor analyses are shown in Tables 5-7
by subsample. Turning to the first factor matrix for grade 1 in Table 5,
three factors obtain, the first of which is identifiable as literal
comprehension, being marked by the CAT Words in Context subtest, the cloze noun
score, the CAT comprehension subtests, the cloze verb score, the various
Wh-Item subscores, and the CAT subtests for Picture-Word Association, Language
Mechanics, and Language Usage. The last three subtests are not of particular
importance in defining factor I. As expected, the IQ subscores load at
very low levels with I.



Place Tables 5-7 About Here 



The second factor is composed largely of the CAT phonological,
orthographic, and word recognition skills, with moderate loadings on the factor
for the IQ subscores and the CAT language test scores. This factor seems
to reflect a combination of the pre-reading skills and general verbal
ability that are important components in learning to read.
The third factor appears to be the second component of the MCC
identified in stage 1, which can be seen here to be independent of IQ. The
fourth factor is primarily defined by the CAT Sentence-Picture Association
subscore.

Dropping the adjective and adverb subscores in the second factor matrix
increases the loading of the cloze noun and verb scores in I and eliminates
the second component of the cloze as in Stage 1. In the oblique solution,
the correlations among the factors were generally low (R I·II = .41 and
R I·III = .55), supporting the hypothesis that the literal comprehension
factor would be essentially independent of inferential processes.

The pattern of results for Level I is considered to be generally
supportive of theoretical expectations, although there is the apparent
inconsistency of the loading of the CAT Inferences subtest on I and the failure
of a CAT "inferences" factor to appear in the matrix. These inconsistencies
seem to be resolved by the fact that inferential processes are represented
only weakly in the Level I CAT comprehension section, there being only 8
items thus classified. Moreover, in the process of completing the CAT Item
Classification scores, many of these items were seen as doubtful measures
of inference.

The rotated factor matrices for grades 2 and 3 are shown in Table 6.
The first factor seems to be clearly a literal comprehension factor with
moderate to high loadings on the cloze and Wh-Item tests, and the CAT Word
Recognition, Words in Context, and Facts subtests. The IQ subtests are
virtually uncorrelated with Factor I, and the inferential subtests have
moderate loadings with the factor. Factor II, which has high loadings for
all four cloze subscores, appears to be a complex of variables involving
language skills, IQ, and virtually all of the vocabulary and comprehension
sections of the CAT. This factor appears to combine the more than "strictly
literal" component of the cloze, which is thought to be reflected in the
tendency of the MCC test to split into a second component, with the even
higher order reasoning processes reflected in the IQ subscores.

As before, dropping the adjective and adverb component from the test
score matrix in the second analysis raises the contribution of the cloze
to I but does not otherwise change the interpretation of the results. The
failure of the second cloze component to be resolved as a factor independent
of IQ appears to explain in part the substantial correlation between factors
I and II (R I·II = .64) in the oblique solution.

The factor matrices for the grade 4-6 subsample are shown in Table 7,
and as will be seen, these results are more consistent with expectations.
As before, I is clearly interpretable as a literal comprehension factor,
but here the loadings of the MCC with I are generally higher than in
previous levels of the analysis. Factors II and III appear to have resolved
the conglomeration of language and reasoning skills in factor II of the
previous analysis into two separate factors, each of which has a minor
cloze contribution. Factor II is primarily a language factor, while III
is largely an IQ or reasoning factor marked by moderate loadings for the CAT
Vocabulary and Comprehension subtests.

Dropping the adjective and adverb scores in the second analysis has no
appreciable effect on the pattern of results. The correlations among the
factors from the oblique solution were somewhat lower than in the previous
analysis (R I·II = .42; R I·III = .54).
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Stage 3 

Since the results for stage 3, shown in Tables 8-10, closely parallel
the findings of the previous level of analysis, they are discussed here as
a group, with a focus on the possible contribution of the CAT Item
Classification to theoretical clarity. In the grade 1 subsample, there is a
tendency for the literal comprehension subscore to load on factor I more
substantially than the non-literal subscore. A similar relationship is found
in the grade 2 and 3 results, but the literal comprehension score also loads
about equally on factors I and II. In the grade 4-6 subsample, the pattern is
somewhat more consistent with expectations in that the non-literal
comprehension subscore loads at a low level with factor I and at a moderate
level with factor III, the IQ or reasoning factor. However, the literal
comprehension subscores load about equally with factors I and III. The
Test-wiseness score added to this stage of the analysis fails to relate
substantially to any of the factors.

Place Tables 8-10 About Here 



Discussion 

In retrospect, the present study represents a highly complex background
against which a tentative and still vague conceptualization of literal
comprehension was explored. The analysis of factor structures across
different age-graded samples and variable test criteria constituted a
complex interacting context involving developing cognitive abilities; shifts
in the psycholinguistic meaning of the test criteria used, resulting very
likely in changes in the types of skills tapped; and changes within and
between subsamples in the demands made by the MCC and Wh-Item formats on
students' syntactic and semantic competence. It is unreasonable to expect
any clean set of results given this context for exploration and, certainly,
a somewhat mixed set of results ensued. However, the results seem to be
sufficiently consistent to conclude that the conceptualization for the study
is in the right direction and to further offer a few tentative generalizations.

The data appear to support the conclusion that the MCC format is in
part a measure of a restricted form of reading comprehension that is
essentially independent of IQ. This form of comprehension appears to be
interpretable as the apprehension of the "strictly literal" meanings
contained in sentences and phrases as measured by reading tests that focus on
factual questions, questions about explicit details, and questions about
interpretation of meanings within the context of isolated sentences and
phrases. More tenuously related to the data is the conclusion that the
MCC format is composed of a second and possibly a third component that
reflects other than "strictly literal" comprehension processes. It was
apparently too much to expect that appropriate criteria elucidating this
second component of the cloze would be found in the CAT, even with an
arduous re-classification of the comprehension items in the test.

The next stage of research on the MCC format must obviously be concerned
with the development of a broader range of test or performance criteria
specifically designed to tap the more expansive implications of the MCC
format as a broad and generalizable measure of literal comprehension.
Recent progress in clarifying the construct of literal comprehension in
Schuder et al. (1976), beyond the admittedly crude conceptualization that
guided the re-classification of CAT comprehension items used here, provides
a number of important leads for constructing these test criteria. In
addition, research along these lines must be concerned with measuring the
syntactic and semantic demands made by the test passages on the testee.
Clearly, the complexity of the test passages in a cloze test will influence
the correlations between the test and other reading and cognitive performance
criteria.

The findings presented here further demonstrate the futility, from a
theoretical point of view, of correlating cloze test scores with overall
scores from standardized measures of reading comprehension or with similar
home-grown measures. Judging by the CAT, such tests are a complex collection
of item types, whose psycholinguistic nature is not especially revealed
by the test maker. That the psycholinguistic ambiguity of the CAT may
generalize, at least in part, to the products of other major reading test
makers was shown in a recent study by Tuinman (1973-74). Tuinman's analysis
showed that several well-known standardized comprehension tests had
substantial numbers of items that were not passage dependent, a problem that
was clearly apparent in the detailed examination of the CAT in the present
study.

Finally, it seems that the present study provides tentative support
for Carroll's (1972) contention that language comprehension, or literal
comprehension as it is called here, could be isolated from inferential or
reasoning processes by more careful test construction procedures. Identified
as a factor(s) in the present study, literal comprehension accounted for
well more than a minuscule proportion of the variance of the various tests
used in the factor analyses. In addition, the contention that literal
comprehension would be essentially independent of IQ was largely supported
by the data, particularly if the rationale concerning the literal versus
non-literal content of the CAT is accepted. Other studies have found much
higher correlations among factors that are presumed to make up tests of
comprehension and reasoning (Bormuth, 1969; Davis, 1968; Spearritt, 1972),
so much so that reading comprehension and reasoning have been equated
(Thorndike, 1973-74).

Carroll's hypothesis is thus very much worthy of further investigation,
particularly since the bifurcation of reading skills into two basic processes
has broad implications for reading instruction.